Implement bytes.startswith in mypyc #20387
Merged
JukkaL merged 3 commits into python:master on Dec 11, 2025
72f9ae5 to 2064d9e
e8c65fe to 1185d2f
Contributor
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
mypyc/lib-rt/bytes_ops.c
Outdated
    const char *self_buf = PyBytes_AS_STRING(self);
    const char *subobj_buf = PyBytes_AS_STRING(subobj);

    if (subobj_len == 0) {
Contributor
Maybe this `if` check should go above the two `PyBytes_AS_STRING` lines? We can exit without those calls if the check returns true.
Collaborator
Author
Good call, updated. I split the checks around each `PyBytes_GET_SIZE` call a bit further to optimize for the empty-arg case. Probably won't save a ton, but I don't think it makes it unreadable.
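The ordering discussed in this thread can be modelled in a few lines of Python. This is only a sketch of the control flow, assuming the C helper checks the argument length first, then the receiver length, and only then compares the buffers; the function name `startswith_model` is hypothetical and the real code lives in `mypyc/lib-rt/bytes_ops.c`.

```python
def startswith_model(self_obj: bytes, subobj: bytes) -> bool:
    """Pure-Python model of the check ordering, not the actual C code."""
    # Cheapest check first: an empty prefix always matches, so return
    # before looking at the receiver at all.
    subobj_len = len(subobj)
    if subobj_len == 0:
        return True

    # A prefix longer than the receiver can never match.
    self_len = len(self_obj)
    if subobj_len > self_len:
        return False

    # Only now touch the underlying data. The slice comparison stands in
    # for a memcmp over the first subobj_len bytes.
    return self_obj[:subobj_len] == subobj
```

The point of the ordering is that both early returns are decided from lengths alone, so the buffer pointers are only needed on the path that actually compares bytes.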
JukkaL reviewed on Dec 10, 2025
    # Test empty cases
    assert test.startswith(b'')
    assert b''.startswith(b'')
    assert not b''.startswith(test)
Collaborator
Test with bytearray as (1) the receiver object and (2) the argument. This way we will also test the slow path.
Collaborator
Author
Added a few checks to cover those as well.
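The added checks are not shown in this thread. A minimal sketch of what bytearray coverage could look like, assuming a sample value for `test` (its actual value in the test file is not visible here):

```python
# Assumed sample value; the real test file defines its own `test`.
test = b"some string"

# bytearray as the receiver object
assert bytearray(test).startswith(b"some")
assert not bytearray(test).startswith(b"other")

# bytearray as the argument
assert test.startswith(bytearray(b"some"))
assert not test.startswith(bytearray(b"other"))

# bytearray on both sides, including the empty case
assert bytearray(test).startswith(bytearray(b"some"))
assert bytearray(b"").startswith(bytearray(b""))
```

Mixing `bytes` and `bytearray` on either side is valid in standard Python, so these cases are the kind that would fall outside an exact-`bytes` fast path.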
JukkaL approved these changes on Dec 11, 2025
JukkaL pushed a commit that referenced this pull request on Jan 15, 2026

Rounding out #20387 and implementing `bytes.endswith`. Simple benchmark shows a ~6.4x improvement.

Tested with the following benchmark code:
```
import time

def bench(suffix: bytes, a: list[bytes], n: int) -> int:
    i = 0
    for x in range(n):
        for b in a:
            if b.endswith(suffix):
                i += 1
    return i

a = [b"foo", b"barasdfsf", b"foobar", b"ab", b"asrtert", b"sertyeryt"]
n = 5 * 1000 * 1000
suffix = b"foo"

bench(suffix, a, n)

t0 = time.time()
bench(suffix, a, n)
td = time.time() - t0
print(f"{td}s")
```

Output:
```
$ python bench.py
0.9002199172973633s
$ python -c "import bench"
0.13828086853027344s
```
michaelm-openai pushed a commit to michaelm-openai/mypy that referenced this pull request on Jan 16, 2026
Implements `bytes.startswith` in mypyc. Potentially could be more efficient without relying on `memcmp`, but not sure.

Tested with the following benchmark code, which shows a ~6.3x performance improvement compared to standard Python:
Output:
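The benchmark listing and its output are missing at this point. As a rough stand-in, here is a sketch adapted from the `bytes.endswith` benchmark quoted in the follow-up commit, with `startswith` swapped in. The sample data, the prefix, and the `run` helper are assumptions, not the author's original code, and no timings are claimed for it.

```python
import time


def bench(prefix: bytes, a: list[bytes], n: int) -> int:
    # Count how many startswith checks succeed over n passes of the list.
    count = 0
    for _ in range(n):
        for b in a:
            if b.startswith(prefix):
                count += 1
    return count


# Sample data borrowed from the endswith benchmark; the prefix is an assumption.
a = [b"foo", b"barasdfsf", b"foobar", b"ab", b"asrtert", b"sertyeryt"]
prefix = b"foo"


def run(n: int = 5 * 1000 * 1000) -> float:
    # One untimed warm-up pass, then one timed pass.
    bench(prefix, a, n)
    t0 = time.perf_counter()
    bench(prefix, a, n)
    return time.perf_counter() - t0


if __name__ == "__main__":
    print(f"{run()}s")
```

Run as `python bench.py` for the interpreted timing. After compiling the module with mypyc, `python -c "import bench; print(bench.run())"` would time the compiled version; the explicit `run()` call is needed because the `__main__` guard keeps the benchmark from running on import.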